Suppression distance computation for hierarchical clusterings

Authors

  • François Queyroi
  • Sergey Kirgizov
Abstract

We discuss the computation of a distance between two hierarchical clusterings of the same set. The distance is defined as the minimum number of elements that have to be removed so that the remaining clusterings are equal. This distance computation problem has been extensively studied for partitions. We prove that for hierarchies it can be solved in polynomial time, as the problem gives rise to a class of perfect graphs. We also propose an algorithm based on recursively computing maximum assignments.
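
The definition above can be illustrated with a small brute-force sketch. This is not the polynomial-time algorithm the abstract refers to; it only enumerates removals in increasing size and checks the definition directly, so it is exponential and suitable for tiny inputs only. The function names, the representation of a hierarchy as a set of frozensets, and the convention that clusters with fewer than two remaining elements are ignored are all assumptions made for illustration.

```python
from itertools import combinations


def restrict(hierarchy, kept):
    """Restrict each cluster to the kept elements.

    Assumption: clusters left with fewer than two elements are treated
    as trivial and dropped, so singletons never affect the comparison.
    """
    kept = frozenset(kept)
    return {cluster & kept for cluster in hierarchy if len(cluster & kept) >= 2}


def suppression_distance(h1, h2, elements):
    """Smallest number of elements to remove so both restrictions coincide.

    Brute force over removal sets of increasing size (exponential time).
    """
    elements = list(elements)
    for removed_count in range(len(elements) + 1):
        for removed in combinations(elements, removed_count):
            kept = set(elements) - set(removed)
            if restrict(h1, kept) == restrict(h2, kept):
                return removed_count
    return len(elements)  # unreachable: removing everything always matches


# Two hypothetical hierarchies on {a, b, c, d} that split the set differently.
elements = "abcd"
h1 = {frozenset("abcd"), frozenset("ab"), frozenset("cd")}
h2 = {frozenset("abcd"), frozenset("ac"), frozenset("bd")}

print(suppression_distance(h1, h2, elements))  # → 2
```

In this example no single removal reconciles the two hierarchies, but removing a and d leaves only the cluster {b, c} in both, so the distance is 2. Partitions can be passed in the same way, as a set of blocks.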


Similar resources

MultiDendrograms: Variable-Group Agglomerative Hierarchical Clusterings

MultiDendrograms is a Java-written application that computes agglomerative hierarchical clusterings of data. Starting from a distances (or weights) matrix, MultiDendrograms is able to calculate its dendrograms using the most common agglomerative hierarchical clustering methods. The application implements a variable-group algorithm that solves the non-uniqueness problem found in the standard pai...

Measuring the Quality of Approximated Clusterings

Clustering has become an increasingly important task in modern application domains. In many areas, e.g. when clustering complex objects, in distributed clustering, or when clustering mobile objects, due to technical, security, or efficiency reasons it is not possible to compute an “optimal” clustering. Recently a lot of research has been done on efficiently computing approximated clusterings. H...

Optimization and Simplification of Hierarchical Clusterings

Clustering is often used to discover structure in data. Clustering systems differ in the objective function used to evaluate clustering quality and the control strategy used to search the space of clusterings. In general, a search strategy cannot both (1) consistently construct clusterings of high quality and (2) be computationally inexpensive. However, we can partition the search so that a sys...

Which, When, and How: Hierarchical Clustering with Human-Machine Cooperation

Human–Machine Cooperations (HMCs) can balance the advantages and disadvantages of human computation (accurate but costly) and machine computation (cheap but inaccurate). This paper studies HMCs in agglomerative hierarchical clusterings, where the machine can ask the human some questions. The human will return the answers to the machine, and the machine will use these answers to correct errors i...

A General Paradigm for Fast, Adaptive Clustering of Biological Sequences

There are numerous methods that compute clusterings of biological sequences based on pairwise distances. This necessitates the computation of O(n²) sequence comparisons. Users usually want to apply the most sensitive distance measure which normally is the most expensive in terms of runtime. This poses a problem if the number of sequences is large or the computation of the measure is slow. In thi...


Journal:
  • Inf. Process. Lett.

Volume 115

Publication date: 2015